Abstract. Subtraction of aligned images is a means to assess changes in a wide variety of
clinical applications. In this paper we explore the information theoretical origin of Mutual
Information (MI), which is based on Shannon's entropy. The interpretation of standard MI
registration as a communication channel suggests, however, that MI is too restrictive a
criterion. We therefore extend the concept of MI to (Normalized) Focussed Mutual
Information (FMI), which incorporates prior knowledge to overcome some shortcomings of MI.
We use this extension to develop new methodologies that address specific registration
problems: the follow-up of dental restorations, cephalometry, and the monitoring of
implants.

Keywords: image registration, registration criteria, information theory, entropy, mutual 
information, piecewise rigid, prior knowledge, dentistry, cephalometry, implants, digital 
subtraction radiography. 
1. Introduction

In a wide variety of clinical applications, subtraction of well-aligned (or well-registered)
images is a standard tool to monitor an evolution [11]. In this paper we study the popular
registration criterion Mutual Information (MI) [20], which is based on the joint entropy of
images. Shannon [18] introduced this type of entropy in the context of a communication
channel, and Collignon [3] applied Shannon's model to MI registration. On the one hand,
modeling image registration using MI as a functional on a "communication channel" imposes
unwanted restrictions in the context of image registration; on the other hand, basic mutual
information is not rich enough: spatial information is lost, and prior knowledge is not
incorporated. In traditional Mutual Information [20] the gray value combinations of all
pixels in the overlap of reference image and floating test image are equally weighted. In a
probabilistic context this means that equal probability is attributed to each pixel in the
overlap. In [8] we studied weighting and introduced Gauss Focussed Mutual Information
(GFMI). Here we place this in a probabilistic context and introduce mutual information with
respect to a more general non-uniform distribution. Prior knowledge related to a specific
registration problem is translated into a sampling distribution that emphasizes the
neighborhood of structures the practitioner wants to align, or of structures that might
contribute to the alignment, and reduces the contribution of regions unimportant to the
practitioner or to the alignment. When monitoring implants or dental restorations, an
obvious element of prior knowledge is the radio-opacity of these implants and restorations.
When 2D/3D images of a hollow longitudinal bone structure are to be aligned, it is natural
to use edge detection to model the geometry of the bone. This geometry can be brought into
the alignment process as prior knowledge through the sampling distribution. In dental
bitewing images used to assess the evolution of local phenomena, such as monitoring (small)
dental lesions [8], it is the practitioner who has to indicate which structure (tooth) has
to be aligned. In Jacquet et al. [8] Gaussian Focussed Mutual Information (GFMI) is
compared to a Region Of Interest (ROI) approach. It was shown that GFMI, in combination
with registration based on affine transformations, is particularly well suited to align
rigid parts when the context of the underlying structure is relevant to the alignment.
When a ROI is used without prior segmentation, the resulting MI (and therefore also the
quality of the registration) depends strongly on the amount of background contained in
the ROI.

In Section 2 image registration, the alignment of images, is formally defined. Intrinsic
registration methods are introduced in Section 3, and the joint entropy of images in
Section 4. Information theory [18] is briefly presented in Section 5. In Section 6 mutual
information based registration is placed in this information theoretical context and
extended to incorporate prior knowledge. In Section 7 we use this extension to develop new
methodologies that address specific registration problems: the follow-up of dental
restorations, cephalometry, and the monitoring of mandibular implants. The same ideas can
be used for registration of 3D images; we are currently developing software and test
strategies for hip, knee, and shoulder implants. We do not address issues of medical
interpretation and diagnosis.
2. Image registration

In practice, a gray scale image is a rectangular array of pixels together with a function
$u$ assigning one of $K$ gray value bins $\{1, \dots, K\}$ to each pixel. It can be
considered as the discretization of a continuous function on a subset $A$ of
$\mathbb{R}^2$, with values in the interval $[0, 1]$. Let $u$ and $v$ be two (continuous)
images with domains $A$ and $B \subset \mathbb{R}^2$ respectively. Let $\mathcal{A}$ be a
class of smooth bijective mappings from $\mathbb{R}^2$ to $\mathbb{R}^2$. The idea of image
registration is to find a transformation $T \in \mathcal{A}$ such that image $u$ and the
transformed image $v_T := v \circ T^{-1}$ are as similar as possible. Given a measure $I$
of similarity between images, optimal registration can be defined as the problem to find

$$\arg\max_{T \in \mathcal{A}} I(u, v_T),$$

the domain transformations maximizing this measure. In this setting we refer to $u$ as
the reference image, $v$ as the test image, and $v_T$ as the floating image. Different
classes of transformations are required depending on the registration application.

3. Intrinsic registration methods 

In Maintz and Viergever a classification of registration methods is introduced. They
call a method "intrinsic" when it relies only on patient generated image content, and
"extrinsic" when objects foreign to the patient are introduced into the imaged scene to
serve as reference for the alignment process. The intrinsic methods are split into
landmark based, segmentation based, and voxel/pixel property based registration methods.
In landmark based and segmentation based registration, corresponding structures are
indicated in or extracted from reference and test image, to be used pairwise as input for
the alignment procedure. A voxel/pixel property based registration criterion is a criterion
directly linked to the discrete two-dimensional gray value maps (3.1).

To be precise, in the process of registration we consider two discrete images, denoted by
$u$ and $v$, transformations $T$, and the two-dimensional gray value maps

$$g_T(m, n) := \bigl(u(T(m, n)),\, v(m, n)\bigr), \tag{3.1}$$

where $u(T(m, n))$ denotes the gray value bin of an interpolation of the known values of
$u$ at $T(m, n)$ within $A$ (Fig. 1). Several interpolation techniques can be applied. We
adopt bilinear interpolation.
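The gray value map (3.1) with bilinear interpolation can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: images are NumPy arrays with values in [0, 1], bins are numbered 0..K-1 instead of 1..K, `T` is any callable returning real coordinates, and the names `bilinear`, `gray_bin` and `gray_value_map` are hypothetical. Points outside the image are simply clamped here; the text instead restricts to points with $T(m, n)$ inside $A$.

```python
import numpy as np

def bilinear(u, x, y):
    """Bilinear interpolation of image u (indexed u[row, col]) at the real point (x, y)."""
    rows, cols = u.shape
    # Top-left corner of the cell containing (x, y), clamped so the cell stays inside u.
    i0 = min(max(int(np.floor(x)), 0), rows - 2)
    j0 = min(max(int(np.floor(y)), 0), cols - 2)
    a, b = x - i0, y - j0  # fractional offsets within the cell
    return ((1 - a) * (1 - b) * u[i0, j0] + a * (1 - b) * u[i0 + 1, j0]
            + (1 - a) * b * u[i0, j0 + 1] + a * b * u[i0 + 1, j0 + 1])

def gray_bin(value, K):
    """Map a gray value in [0, 1] to one of K bins, numbered 0..K-1 in this sketch."""
    return min(int(value * K), K - 1)

def gray_value_map(u, v, T, m, n, K):
    """g_T(m, n): bin of the interpolated reference value at T(m, n) and bin of v(m, n)."""
    x, y = T(m, n)
    return gray_bin(bilinear(u, x, y), K), gray_bin(v[m, n], K)
```

For an affine registration, `T` would be a function such as `lambda m, n: (a11*m + a12*n + b1, a21*m + a22*n + b2)` with hypothetical parameters.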

4. Joint entropy of image pairs 

Consider the function that counts the number of times a gray value combination occurs
in the intersection of the reference image with the transformed test image:

$$C_T(k, \ell) := \#\{(m, n) \mid T(m, n) \in A \,\wedge\, (k, \ell) = (u(T(m, n)), v(m, n))\}. \tag{4.1}$$

If test and reference image are equal and well aligned with respect to each other, the
frequency of unequal pairs of pixel values, i.e. of off-diagonal elements, is zero. If the
well-aligned images differ slightly, off-diagonal elements may become non-zero but are
expected to be small. This suggests that a measure based on the function $C_T$ can be the
basis for aligning images. We divide $C_T$ by the total number of pixels in the
intersection of reference and transformed test image to eliminate the dependence on the
image size. The result is a two-dimensional probability distribution, called the joint
probability of the images. Let $p_{k\ell}$ denote the probability linked to the combination
of gray value classes $k$ and $\ell$, and let $p_{k\bullet}$ and $p_{\bullet\ell}$ denote
its marginal distributions:

$$\begin{aligned}
p_{k\ell} &= \frac{C_T(k, \ell)}{\sum_{i,j} C_T(i, j)} && \text{joint probability,}\\
p_{k\bullet} &= \sum_{\ell} p_{k\ell} && \text{marginal linked to the reference image,}\\
p_{\bullet\ell} &= \sum_{k} p_{k\ell} && \text{marginal linked to the test image.}
\end{aligned} \tag{4.2}$$

Although $p_{k\ell}$ has all formal aspects of a probability, it is merely a relative
frequency, not linked to a stochastic experiment at present.

Figure 1. Mapping of the test image pixels on the reference image through
transform T, cf. [H].
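The construction of the joint probability and its marginals can be illustrated with a short sketch. It assumes the gray value bins of the interpolated reference image and of the test image over the overlap are already available as integer arrays with values 0..K-1; the function name `joint_probability` is hypothetical.

```python
import numpy as np

def joint_probability(ref_bins, test_bins, K):
    """Joint probability p[k, l] of gray value bins over the overlap, with both marginals.

    ref_bins, test_bins: integer arrays of equal shape with values in 0..K-1
    (reference bins along axis 0 of p, test bins along axis 1).
    """
    counts = np.zeros((K, K))
    # Accumulate one count per pixel pair; this is the counting function C_T.
    np.add.at(counts, (ref_bins.ravel(), test_bins.ravel()), 1)
    p = counts / counts.sum()          # divide by the number of pixels in the overlap
    p_ref = p.sum(axis=1)              # marginal linked to the reference image
    p_test = p.sum(axis=0)             # marginal linked to the test image
    return p, p_ref, p_test
```

For two identical, perfectly aligned images all mass lies on the diagonal of `p`, which matches the observation made above about off-diagonal elements.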



Shannon Entropy. A finite probability model is characterized by a set of $n$ elements
(outcomes) and their probabilities $p_1, p_2, \dots, p_n$. Shannon defines a measure of
uncertainty $H$ on the class of all finite probability models,
$\{(p_1, p_2, \dots, p_n) \mid n \in \mathbb{N} \text{ and } \sum_{i=1}^{n} p_i = 1\}$,
satisfying:

$$\begin{aligned}
&H(p_1, p_2, \dots, p_n) \text{ is a non-negative continuous function of the } p_i,\\
&H_n := H(1/n, 1/n, \dots, 1/n) \text{ is an increasing function of } n,\\
&H(p_1, \dots, p_{n-1}, p_n) = H(p_1, \dots, p_{n-2}, p_{n-1} + p_n)
  + (p_{n-1} + p_n)\, H\!\left(\frac{p_{n-1}}{p_{n-1} + p_n}, \frac{p_n}{p_{n-1} + p_n}\right).
\end{aligned} \tag{4.3}$$

Implicitly, but clear from his application:

$$H(p_1, p_2, \dots, p_n) = H(p_{\sigma(1)}, p_{\sigma(2)}, \dots, p_{\sigma(n)}), \tag{4.4}$$

with $\sigma$ a permutation of the indices $1, \dots, n$.

Theorem 1. If $H$ satisfies the above requirements then

$$H(p_1, p_2, \dots, p_n) = -K \sum_{i=1}^{n} p_i \ln p_i, \tag{4.5}$$

where $K$ is a positive constant.

The proof of this theorem can be found in [18]. For $K = 1$, $H$ is referred to as the
Shannon entropy. The Shannon entropy measures the uncertainty about the precise outcome of
an experiment linked to a probability distribution.
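A small numerical check can make the requirements concrete. The sketch below evaluates the Shannon entropy with $K = 1$ and the natural logarithm, and verifies the grouping requirement (the third condition) on one example distribution chosen here purely for illustration.

```python
import numpy as np

def shannon_entropy(p):
    """Shannon entropy -sum p_i ln p_i of a finite distribution (terms with p_i = 0 skipped)."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    return float(-np.sum(p * np.log(p)))

# Grouping requirement: merging the last two outcomes and adding the weighted
# entropy of the split within the merged outcome gives the same total entropy.
p = [0.5, 0.3, 0.2]
s = p[1] + p[2]
lhs = shannon_entropy(p)
rhs = shannon_entropy([p[0], s]) + s * shannon_entropy([p[1] / s, p[2] / s])

# Second requirement: for uniform distributions H_n = ln(n), which increases with n.
uniform_entropies = [shannon_entropy(np.full(n, 1.0 / n)) for n in range(1, 6)]
```

Here `lhs` and `rhs` agree up to rounding, and `uniform_entropies` is strictly increasing.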
Figure 2. Schematic diagram of a general communication system [18]



Joint entropy. The joint entropy $H(u, v_T)$ of two images $u$ and $v_T$ is the Shannon
entropy of the joint probability $\{p_{k\ell}\}$ of the images. The entropy $H(u)$ of the
reference image is the Shannon entropy of $\{p_{k\bullet}\}$, and the entropy $H(v_T)$ is
the Shannon entropy of $\{p_{\bullet\ell}\}$.

When two images are well aligned but the intensities of the same structure differ, the
probabilities will still concentrate and the Shannon entropy will not be affected, since it
is invariant under permutation of elements, see (4.4). Therefore a similarity measure based
on the entropy can be applied when different image acquisition modalities are used. This,
in addition to the absence of preprocessing, is a reason why entropy based similarity
measures are popular in multi-modal registration applications at present.



5. Information theory applied to image registration 



The mathematical theory of communication developed by Shannon [18] is based on a
general communication system described by Fig. 2. Shannon attempts to build an indicator
for the quality of the combination of transmitter, channel and receiver. A communication
system should be designed to enable passing all acceptable messages generated by the
information source. When a language such as English is used by the information source as
coding system, not all acceptable messages are equally likely to be generated. E.g.
the occurrence of the sequence of symbols "My name is" is more likely than the sequence
"Prescience is". Shannon introduces an artificial information source consisting of a Markov
process as a model for the English language. The introduction of this surrogate source
allows a study of the quality and limitations of a communication system with probabilistic
methods. The English language is not the most efficient coding system. To study a
communication system in combination with an English information source, it is important to
define a measure of the information content of a phrase, or of its complement, the
redundancy. The Shannon entropy $H$ is built to measure how uncertain we are about a
specific phrase (outcome) or, complementarily, how probable the specific phrase is when we
suppose it has been generated by a Markov process modeling the English language.

Let us try to understand the requirements that define $H$. The first requirement is
continuity: there is no clear reason to introduce "jumps", so the continuity requirement
does not seem too restrictive. The second requirement states that if the number of possible
outcomes increases, and if all outcomes are equally probable, the uncertainty about the
realization of one particular outcome increases. The last requirement can be understood
from modeling a language at different levels. E.g. modeling the language at the level of
characters should be compatible with, but less distinctive than, modeling the language at
the level of words. To be precise: when a model is refined, the entropy should grow. If a
system is split into two sub-systems, the increment of the entropy is equal to the
probability of the system multiplied by the entropy within the system.

Collignon [3] applied the model of a communication channel to image registration under
the following assumptions:

• Consider the test image to be the transmitted signal.

• Take the reference image to be the received signal.

• The communication channel is determined by the registration parameters.

• Optimizing the mutual information between the signals is equivalent to the design
of an optimal communication channel.

• Both images are assumed to represent the same scene, and their multi-modal
differences are considered as noise generated by the communication channel.

Inspecting the construction of the joint probability of two images in more detail, one can
see that it is induced by the gray value mapping $g_T$, which transforms the uniform
distribution on the pixels $T(m, n) \in A$ from the test image into an induced probability
on $\{1, \dots, K\} \times \{1, \dots, K\}$. Therefore a process generating the joint
probability consists of the sequential uniform random pick of nodes $T(m, n)$ located
within the domain $A$ of the reference image and the evaluation of $u(T(m, n))$ and
$v(m, n)$. The signal becomes the evaluations $u_t := u(T(m_t, n_t))$; the channel is
defined through the conditional probabilities $p_{k u_t} / p_{\bullet u_t}$, where
$k = 1, \dots, K$. The origin of the dispersion of these conditional probabilities is
determined by the modalities used to acquire the images, the instrument settings, and the
quality of the alignment. The joint probability is only a first order model, in the sense
that all information concerning the location of the gray value couples in both images is
ignored. This is comparable to the generation of a sequence of letters and spaces with the
frequency of occurrence in the English language and its use as a model for an English
phrase. The spatial relation of pixels corresponds to the sequential order of the
characters in a sentence. This spatial relation is discarded altogether in MI.
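The loss of spatial information can be demonstrated numerically. In the sketch below (synthetic data and hypothetical names, chosen only for illustration) the pixels of two images are shuffled with the same random permutation: the images become spatially meaningless, yet the set of gray value couples, and hence the mutual information, is unchanged.

```python
import numpy as np

def mutual_information(a, b, K):
    """MI = H(a) + H(b) - H(a, b) from the joint histogram of two binned images."""
    counts = np.zeros((K, K))
    np.add.at(counts, (a.ravel(), b.ravel()), 1)
    p = counts / counts.sum()

    def entropy(q):
        q = q[q > 0]
        return float(-np.sum(q * np.log(q)))

    return entropy(p.sum(axis=1)) + entropy(p.sum(axis=0)) - entropy(p.ravel())

rng = np.random.default_rng(0)
K = 8
u = rng.integers(0, K, size=(32, 32))                  # synthetic "reference" bins
v = (u + rng.integers(0, 2, size=u.shape)) % K         # noisy, intensity-shifted "test" bins

perm = rng.permutation(u.size)                         # one permutation applied to both images
u_shuffled = u.ravel()[perm].reshape(u.shape)
v_shuffled = v.ravel()[perm].reshape(v.shape)

mi_original = mutual_information(u, v, K)
mi_shuffled = mutual_information(u_shuffled, v_shuffled, K)
```

Both values coincide, which is exactly the first order character of the joint probability described above.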

Although the effort to translate the image registration problem into a problem about
modeling a communication channel is appealing from a theoretical validation viewpoint, it
limits the search for new criteria, and it directs the search towards the statistical
estimation of probability densities naturally emerging from an unknown "communication
system".

The third condition of (4.3) does not seem mandatory in an image modeling context.
Therefore, the substitution of e.g. the logarithm by a square root or any other convex
function in (4.5) is an interesting line of thought, influencing the robustness of the
criterion. Hughes and Daubechies [4] propose simpler alternative metrics based on the joint
probability of overlapping images. The introduction of the spatial relation can be achieved
by taking into account neighboring gray value couples. This approach has been explored,
among others, in [17]; in [13] and [3] gradient information is calculated and incorporated
into the functional, and in [15] gray value differences of neighboring pixels/voxels are
used. Spatial information can also be introduced in MI through blurring of the images
before registration. However, this blurring may cause the loss of a significant amount of
information.

Another line of thought, apart from spatial considerations and robustness, is to
incorporate prior knowledge about scene and application by replacing the uniform sampling
by a sampling according to a suitable distribution. A higher probability can be attributed
to regions required to align particularly well, or to structures relevant to the alignment.
In this paper we shall consider non-uniform distributions, and introduce different
methodologies to incorporate prior knowledge into the sampling distribution, tailored to
specific applications.

Another issue in registration based on pixel/voxel based criteria is the image overlap
[19]. Size and content of the overlap may change considerably during the registration.
When dealing with images with a very high aspect ratio, even "small" transformations can
have an important influence on the overlap. To reduce the sensitivity of MI to these
changes in overlap statistics, and to minimize the resulting misalignments, the Normalized
Mutual Information $Y$ [19] and the Entropy Correlation Coefficient (ECC) [10] have been
introduced. These criteria have shown better robustness to changes in overlap statistics
than MI. Therefore, we will adapt $Y$ to non-uniform distributions and use it in our
applications. We will not use an analogous adaptation of the ECC, since ECC is directly
related to $Y$.
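The relation between ECC and $Y$ can be made explicit with a short derivation. It assumes the usual definitions $Y(u, v_T) = (H(u) + H(v_T)) / H(u, v_T)$ and $ECC(u, v_T) = 2\,MI(u, v_T) / (H(u) + H(v_T))$, with $MI(u, v_T) = H(u) + H(v_T) - H(u, v_T)$:

```latex
\begin{aligned}
ECC(u, v_T)
  &= \frac{2\,\bigl(H(u) + H(v_T) - H(u, v_T)\bigr)}{H(u) + H(v_T)} \\
  &= 2 - \frac{2\,H(u, v_T)}{H(u) + H(v_T)}
   = 2 - \frac{2}{Y(u, v_T)} .
\end{aligned}
```

Since $ECC$ is a strictly increasing function of $Y$, both criteria are maximized by the same transformations, so adapting only one of them loses nothing.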



6. (Normalized) Focussed Mutual Information with respect to a density 



Shannon [18] extends the entropy of a finite probability distribution to the entropy of a
probability distribution with density $f$ on a domain $\Omega$:

$$H(f) = -\int_{\Omega} f \log(f). \tag{6.6}$$

Let $u$ and $w$ be continuous images with intersecting domains $A$ and $B$, $f$ a
probability density function on $A \cap B$, and $g$ the function assigning to each point
$\omega$ in the intersection $A \cap B$ the couple of gray values
$g(\omega) := (u(\omega), w(\omega))$. Denote by $f_g$ the probability density function on
$[0,1] \times [0,1]$ generated through $g$, and denote by $f_u$ and $f_w$ the probability
density functions on $[0, 1]$ generated by $u$ and $w$, respectively. Note that $f_u$ and
$f_w$ are the marginal distributions of $f_g$ with respect to $y$ and $x$ respectively. We
define

• Focussed Mutual Information of the images $u$ and $w$ with respect to a density $f$ on
the overlap of the images $A \cap B$ as follows:

$$MI_f(u, w) := H(f_u) + H(f_w) - H(f_g), \tag{6.7}$$

• Normalized Focussed Mutual Information of the images $u$ and $w$ with respect to a
density $f$ on the overlap of the images $A \cap B$ as follows:

$$Y_f(u, w) := \frac{H(f_u) + H(f_w)}{H(f_g)}, \tag{6.8}$$

• Focussed Entropy Correlation Coefficient of the images $u$ and $w$ with respect to a
density $f$ on the overlap of the images $A \cap B$ as follows:

$$ECC_f(u, w) := \frac{2\, MI_f(u, w)}{H(f_u) + H(f_w)}. \tag{6.9}$$

These generalizations reduce to the original concepts if $f$ is chosen as the uniform
distribution, i.e. if there is no focussing. Thus Focussed Mutual Information, Eq. (6.7),
reduces to MI as introduced by Collignon [3] and Maes [10]; Normalized Focussed Mutual
Information, Eq. (6.8), reduces to $Y$ as introduced by Studholme [19]; and the Focussed
Entropy Correlation Coefficient, Eq. (6.9), reduces to the ECC of Maes [10].

The Focussed Mutual Information with respect to a density function $f$ is an extension of
the Gauss Focussed Mutual Information introduced by Jacquet et al. [8]. More precisely,
given two continuous images $u$ and $w$ with domains $A$ and $B$ respectively, and $f$
the convex combination of a finite number of normal density functions $f_i$, normalized to
be a probability density function on $A \cap B$, then $MI_f(u, w)$ is the Gauss Focussed
Mutual Information.
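A discrete Gaussian focus distribution of this kind can be sketched as follows. The assumptions are mine: isotropic Gaussians, centers given in pixel coordinates (row, column), equal weights unless stated otherwise, and normalization over a rectangular pixel grid standing in for the overlap. The function name `gauss_focus` and its parameters are hypothetical.

```python
import numpy as np

def gauss_focus(shape, centers, sigmas, weights=None):
    """Convex combination of isotropic Gaussians, normalized to sum to 1 on the pixel grid."""
    rows, cols = np.indices(shape)
    centers = np.asarray(centers, dtype=float)
    sigmas = np.broadcast_to(np.asarray(sigmas, dtype=float), (len(centers),))
    if weights is None:
        weights = np.full(len(centers), 1.0 / len(centers))  # equal convex weights
    f = np.zeros(shape)
    for (r0, c0), s, w in zip(centers, sigmas, weights):
        d2 = (rows - r0) ** 2 + (cols - c0) ** 2
        f += w * np.exp(-d2 / (2.0 * s ** 2)) / (2.0 * np.pi * s ** 2)
    return f / f.sum()  # renormalize because the grid truncates the Gaussians
```

For example, `gauss_focus((64, 64), [(20, 20), (40, 45)], 5.0)` places two foci of equal weight, as a practitioner might do by marking two points on a tooth.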



Approximation of $MI_f$, $Y_f$, and $ECC_f$. We consider two discrete images, denoted by
$u$ and $v$. Let $(k, \ell)$ be a gray value combination. Denote by $D(k, \ell)$ the set of
all pixel pairs in the intersection that have (approximately) the gray value combination
$(k, \ell)$,

$$D(k, \ell) := \{(m, n) \mid T(m, n) \in A \,\wedge\, (k, \ell) = (u(T(m, n)), v(m, n))\}.$$

As before in (4.1), we define $C_T(k, \ell)$ as the number of elements of $D(k, \ell)$, in
which each element contributes equally (or is equally weighted). We introduce
$W_T(k, \ell)$ as a weighted sum in which the contribution of each pixel pair is
proportional to the focus probability assigned to its location, and we normalize those
quantities into probabilities:

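$$\begin{aligned}
C_T(k, \ell) &:= \sum_{(m,n) \in D(k,\ell)} 1, &
W_T(k, \ell) &:= \sum_{(m,n) \in D(k,\ell)} f(T(m, n)),\\
p_{k\ell} &:= \frac{C_T(k, \ell)}{\sum_{i,j} C_T(i, j)}, &
\pi_{k\ell} &:= \frac{W_T(k, \ell)}{\sum_{i,j} W_T(i, j)},\\
p_{k\bullet} &:= \sum_{\ell} p_{k\ell}, &
\pi_{k\bullet} &:= \sum_{\ell} \pi_{k\ell},\\
p_{\bullet\ell} &:= \sum_{k} p_{k\ell}, &
\pi_{\bullet\ell} &:= \sum_{k} \pi_{k\ell}.
\end{aligned} \tag{6.10}$$

Here $p_{k\ell}$ is the joint probability of the images, see Eq. (4.2), $\pi_{k\ell}$ the
focussed joint probability of the images, $\pi_{k\bullet}$ the focussed marginal linked to
the reference image, and $\pi_{\bullet\ell}$ the focussed marginal linked to the test
image. The corresponding ("focussed") entropies are: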

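$$\begin{aligned}
H_f(u, v_T) &= -\sum_{k,\ell} \pi_{k\ell} \log(\pi_{k\ell}),\\
H_f(u) &= -\sum_{k} \pi_{k\bullet} \log(\pi_{k\bullet}),\\
H_f(v_T) &= -\sum_{\ell} \pi_{\bullet\ell} \log(\pi_{\bullet\ell}).
\end{aligned} \tag{6.11}$$

We introduce approximations of $MI_f(u, v_T)$, $Y_f(u, v_T)$ and $ECC_f(u, v_T)$:

$$\begin{aligned}
MI_f(u, v_T) &= H_f(u) + H_f(v_T) - H_f(u, v_T),\\
Y_f(u, v_T) &= \frac{H_f(u) + H_f(v_T)}{H_f(u, v_T)},\\
ECC_f(u, v_T) &= \frac{2\, MI_f(u, v_T)}{H_f(u) + H_f(v_T)}.
\end{aligned} \tag{6.12}$$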
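The discrete approximations can be sketched in a few lines. The sketch assumes that the gray value bins of both images over the overlap are given as integer arrays (values 0..K-1) and that `weights` holds, for each pixel pair, the focus value assigned to its location; these names and the function `focussed_criteria` are hypothetical. With constant weights the result reduces to the unfocussed MI, $Y$ and ECC.

```python
import numpy as np

def focussed_criteria(ref_bins, test_bins, weights, K):
    """Focussed MI, normalized FMI (Y) and ECC from weighted gray value couples."""
    W = np.zeros((K, K))
    # Weighted sum per gray value combination instead of a plain count.
    np.add.at(W, (ref_bins.ravel(), test_bins.ravel()), weights.ravel())
    pi = W / W.sum()                                   # focussed joint probability

    def entropy(q):
        q = q[q > 0]
        return float(-np.sum(q * np.log(q)))

    h_u = entropy(pi.sum(axis=1))                      # focussed entropy, reference image
    h_v = entropy(pi.sum(axis=0))                      # focussed entropy, test image
    h_uv = entropy(pi.ravel())                         # focussed joint entropy
    mi = h_u + h_v - h_uv
    # Note: h_uv is zero only if a single gray value couple carries all weight.
    return {"MI": mi, "Y": (h_u + h_v) / h_uv, "ECC": 2.0 * mi / (h_u + h_v)}
```

For two identical images the joint and marginal entropies coincide, so the sketch returns $Y = 2$ and $ECC = 1$, the maximal values of both criteria.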



7. Applications 



In this section we introduce methodologies involving FMI and Digital Subtraction
Radiography (DSR), tailored to specific clinical applications. Each of the proposed
registration methods is a hybrid between a landmark/segmentation based and a pixel/voxel
based method. Anatomical structures, present in reference and test image, are used to
define a probability distribution $f$ on the reference image incorporating the prior
knowledge of the problem. The trace distributions $f_T$ of the probability distribution $f$
on the intersection of the domains of reference image and floating test image constitute
the basis for a pixel/voxel based registration. The registration criterion is the
Normalized Focussed Mutual Information $Y_f$, see Eq. (6.8).

In a standard feature based registration, landmarks have to be identified both in the
reference image and in the test image. In contrast, the focus distribution has to be
defined for the reference image only, which eliminates the need for accurate landmark
detection and for pairwise landmark matching. The locations of the landmarks are used only
to define the regions of high probability; extreme accuracy is not needed. Nevertheless it
may be useful to add some landmarks to the test image (not necessarily accurately located)
to obtain an initial guess $T'$. The search space of the parameters of the transformation
$T$ is then limited to a region located symmetrically around the parameter values of $T'$.
In our experiments we restrict ourselves to affine transformations as first order
approximations.

7.1. Dentistry and orthodontics. The detection and evolution of periodontal diseases
and of alveolar bone changes can be facilitated through intra-oral radiography in
combination with Digital Subtraction Radiography (DSR) [13]. Non-intervention therapy,
observation in combination with adequate preventive measures, and therapies such as pulp
capping require the possibility to assess evolution. DSR is a promising technique for the
follow-up of small lesions [6], restorations, and pulp capping. The most common imaging
techniques in dentistry and orthodontics are 2D. The variation in geometry between image
acquisitions most often results in irreversible distortions from one image to the other
[21]. Satisfactory alignment of the whole image can only be obtained if the variation in
geometry is sufficiently small. The variation can only be small enough if the X-rayed scene
can be considered rigid. In dental applications this is most often not the case, due to
natural tooth mobility, orthodontic treatment, or movement of the lower jaw with respect to
the upper jaw. Focussing on the structures around the local phenomenon under study allows
for a better local alignment. Such a structure can be indicated manually or by means of
an automatic or semi-automatic procedure. For the follow-up of small lesions Jacquet et
al. [8] explore the use of FMI based on a convex combination of Gaussian distributions in
combination with DSR. In that study the tooth to be monitored is marked manually in the
reference image with a digital tool. In what follows we give a case study of the
(semi-)automatic generation of a focus distribution for monitoring the quality of a dental
restoration over time. In Fig. 3 we see two images of the teeth of one person, taken two
years apart. We want to register the restoration on the smaller upper molar. To this aim we
construct a focus distribution around the restoration in the reference image, exploiting
the prior knowledge that restorations (and implants) are more radio-opaque than the
surrounding natural material.

Algorithm. Automatic generation of a focus distribution aiming at the correct registration
of consecutive images of a tooth restoration:

(1) Find (all) edges in the reference image by:

• median filtering to eliminate "pepper and salt" noise from the reference image,

• computation of the modulus of the gradient,

• convolution with a Gaussian kernel.

This results in Fig. 4 left.

(2) Find a patch that contains the whole restoration:

• segmentation using a threshold to select the restoration,

• morphological closing and dilation.

The indicator function of the patch is shown in Fig. 4 right.

(3) Create the focus distribution:

• multiply the patch from step (2) and the edge distribution produced in step (1).

Figure 3. Reference image (left) and Test image (right).

Figure 4. Edge image (left) and closed and dilated restoration (right).

FMI registration using this focus distribution results in Fig. 5 right, showing a well
aligned restoration. One could think of first creating the patch selecting the part of the
image containing the restoration, followed by edge detection and convolution. Working in
this order, however, we may easily create spurious edges due to the border of the indicator
of the patch.
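The three steps of the algorithm can be sketched with `scipy.ndimage`. This is an illustration under assumptions of my own, not the authors' implementation: the Sobel operator is used as gradient estimator, and the threshold, filter sizes and iteration counts are hypothetical parameters that would have to be tuned per image. The edges are computed on the whole image before masking, which avoids the spurious edges mentioned above.

```python
import numpy as np
from scipy import ndimage

def restoration_focus(ref, threshold, median_size=3, sigma=2.0, dilate_iter=3):
    """Focus distribution around a radio-opaque restoration in the reference image."""
    ref = np.asarray(ref, dtype=float)
    # Step (1): edge distribution of the whole reference image.
    smooth = ndimage.median_filter(ref, size=median_size)   # remove impulse noise
    g_rows = ndimage.sobel(smooth, axis=0, mode="nearest")
    g_cols = ndimage.sobel(smooth, axis=1, mode="nearest")
    edges = ndimage.gaussian_filter(np.hypot(g_rows, g_cols), sigma=sigma)
    # Step (2): patch containing the whole restoration.
    patch = smooth > threshold                              # restoration is brightest
    patch = ndimage.binary_closing(patch, iterations=2)
    patch = ndimage.binary_dilation(patch, iterations=dilate_iter)
    # Step (3): multiply indicator and edge distribution, then normalize.
    focus = edges * patch
    total = focus.sum()
    if total == 0:
        raise ValueError("no edges inside the patch; check the threshold")
    return focus / total
```

The returned array sums to one and vanishes outside the dilated patch, so only edges on and near the restoration contribute to the focussed probabilities.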

7.2. Cephalometry. Cephalometry is used as a diagnostic tool and as a basis for
treatment planning, but also to monitor and evaluate treatment results [1]. In clinical
practice the evolution is assessed by superimposing consecutive lateral radiographs based
on anatomical landmarks.

As a case study we applied FMI registration to an example of false maxillary
prognathism. A lack of growth of the mandible is corrected by means of a combined surgical
and orthodontic treatment, in which the mandible has been advanced. A lateral radiograph
is taken before treatment (Fig. 6 left), and a follow-up lateral radiograph is taken two
years after treatment (Fig. 6 right). The purpose of the images is the evaluation of
skeletal stability and of the orthodontic treatment.
Figure 5. Focus distribution f (left) and subtraction image (right).




Figure 6. Reference image (left) and Test image (right) with indicated features. 




Figure 7. Focus distribution f (left) and subtraction image (right). 

The practitioner is asked to indicate each structure to be used in the alignment procedure
with a limited number of points (15) in the reference image (Fig. 6 left) and the test
image (Fig. 6 right). The points in the reference image are transformed into a model of the
skull using B-splines. The resulting image is convolved with a Gaussian kernel and used as
distribution in an FMI registration, see Fig. 7 left. The subtraction image clearly shows
the effect of the orthodontic treatment and the extent of the surgical correction, see
Fig. 7 right.
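The construction of such an elongated focus can be sketched as follows. The sketch assumes one open curve per structure, an interpolating cubic B-spline through the indicated points, and a hypothetical Gaussian width; the function name `curve_focus` and its parameters are illustrative only.

```python
import numpy as np
from scipy import interpolate, ndimage

def curve_focus(shape, points, sigma=4.0, samples=500):
    """Focus distribution along a B-spline through practitioner-indicated points.

    points: sequence of (row, col) positions in the reference image, at least four,
    without consecutive duplicates.
    """
    pts = np.asarray(points, dtype=float)
    # Interpolating cubic B-spline through the points (s=0 means no smoothing).
    tck, _ = interpolate.splprep([pts[:, 0], pts[:, 1]], s=0, k=3)
    r, c = interpolate.splev(np.linspace(0.0, 1.0, samples), tck)
    # Rasterize the curve; samples falling outside the image are clipped to the border.
    r = np.clip(np.rint(r).astype(int), 0, shape[0] - 1)
    c = np.clip(np.rint(c).astype(int), 0, shape[1] - 1)
    model = np.zeros(shape)
    model[r, c] = 1.0
    # Convolve the line model with a Gaussian kernel and normalize.
    focus = ndimage.gaussian_filter(model, sigma=sigma)
    return focus / focus.sum()
```

Several structures could be handled by summing the rasterized models of the individual curves before the Gaussian convolution.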
In the alignment of the lateral radiographs of the skull, the input of the practitioner
can easily be reduced or removed. The detection of the edges delineating the front and back
of the skull can be fully automated and used as the input for the FMI registration of the
lateral radiographs. Another line of thought is to use automatically detected landmarks in
the reference image as prior knowledge to construct a focus distribution. The automatic
detection of cephalometric anatomical landmarks is promising, see e.g. [2] and [LQ\. In
combination with the reduced need for accurate localization of landmarks in FMI, such
methods can provide the basis for a successful automated FMI registration algorithm.

An even more challenging application is the use of registration of lateral images of the
skull in treatment planning. Crucial in the decision to start the orthodontic and/or
operative treatment of an adolescent is the detection of the end of the pubertal growth
spurt. For characterizing the growth curve we plan to study the evolution of the
registration parameters, more precisely, the scaling needed to adjust consecutive images of
the skull.

7.3. Follow-up of implants. Digital subtraction of scans taken at time intervals can be
used to monitor the evolution of a prosthesis with respect to its wear and anchoring in the
skeleton, more particularly to assess the relative movement of a prosthesis with respect to
the cavity in which it resides. The use of digital subtraction could be relevant to the
detection and follow-up of aseptic loosening of implants. Focussing on the bone structure,
we can model the bone in its surrounding soft tissue. Moreover, we can eliminate the effect
of the implant from the registration procedure, using the hollow structure of the bones and
the radio-opacity of the implants as prior knowledge.

Algorithm. Automatic generation of a focus distribution aiming at the correct registration
of the bone structure surrounding an implant:

(1) Find (all) edges in the reference image by:

• median filtering to eliminate "pepper and salt" noise from the reference image,

• computation of the modulus of the gradient,

• convolution with a Gaussian kernel.

This results in an edge distribution focussing all the edges.

(2) Find the complement of a patch covering the implant:

• segmentation using a threshold to select the implant,

• morphological closing and dilation,

• creation of an indicator of the complement of the patch covering the implant.

(3) Create the focus distribution:

• multiply the indicator from step (2) and the edge distribution produced in step (1).

Only edges corresponding to structures not related to the implant will contribute to the 
FMI registration. The reason to focus on the bone structure is that it becomes easy to 
measure the movement of the implants when the bone structure is well aligned. In the case 
of dental implants the opposite procedure is more appropriate. It is better to register the 
implant and evaluate the evolution of the surrounding bone tissue. 3D-2D projections will 
make displacement measurements unreliable. 
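The difference with the restoration case lies only in steps (2) and (3), where the complement of the patch is used. A minimal sketch, assuming an edge distribution and a binary patch covering the implant have already been computed (the function name `focus_outside_patch` is hypothetical):

```python
import numpy as np

def focus_outside_patch(edges, patch):
    """Focus distribution that keeps only edges outside the patch covering the implant."""
    complement = ~np.asarray(patch, dtype=bool)   # indicator of the complement of the patch
    focus = np.asarray(edges, dtype=float) * complement
    total = focus.sum()
    if total == 0:
        raise ValueError("no edge mass outside the patch")
    return focus / total
```

For dental implants, where the implant itself is registered, the patch indicator would be used directly instead of its complement.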

Algorithm Automatic generation of a focus distribution aiming at the correct registration 
of a dental implant: 

(1). Find (all) edges in the reference image by: 

• median filtering to eliminate "pepper and salt" noise from the reference image. 

• computation of the modulus of the gradient (Fig. [9] left). 
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Figure 8. Reference image (left) and Test image (right). 
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Figure 9. Modulus of the gradient (left) and modulus convolved with a Gaussian 
kernel (right). 

• convolution with a Gaussian kernel. 

This results in an edge distribution focussing all the edges (Fig. [9] right). 

(2) . Find a patch covering the implant: 

• segmentation using a threshold (Fig. [TO] left). 

• morphological closing and dilation (Fig. [TO] right). 

(3). Create the focus distribution: 

• multiply the patch from step (2) by the edge distribution produced in step (1). 

FMI using this focus distribution results in Fig. 11 right, demonstrating that both images 
are well aligned with respect to the dental implant. 
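
To illustrate how such a focus distribution could enter the registration criterion, the sketch below weights every pixel pair in the joint histogram by the focus value at that pixel and then computes the usual mutual information. This is an assumption-laden sketch: the precise definition of (Normalized) FMI is the one given earlier in the paper and may differ, and the function name `focussed_mutual_information` and the parameter `bins` are introduced here for illustration only.

```python
import numpy as np

def focussed_mutual_information(reference, test, focus, bins=32):
    """Mutual information of two aligned images, each pixel weighted by `focus`."""
    ref = np.asarray(reference, dtype=float).ravel()
    tst = np.asarray(test, dtype=float).ravel()
    wgt = np.asarray(focus, dtype=float).ravel()
    # Weighted joint histogram of gray value pairs.
    joint, _, _ = np.histogram2d(ref, tst, bins=bins, weights=wgt)
    total = joint.sum()
    if total <= 0:
        raise ValueError("focus distribution has no mass on the overlap")
    p = joint / total
    p_ref = p.sum(axis=1, keepdims=True)   # marginal of the reference image
    p_tst = p.sum(axis=0, keepdims=True)   # marginal of the test image
    nonzero = p > 0
    ratio = p[nonzero] / (p_ref @ p_tst)[nonzero]
    return float(np.sum(p[nonzero] * np.log(ratio)))

# Hypothetical example: the lower half of the test image has changed.
rng = np.random.default_rng(0)
reference = rng.integers(0, 256, (32, 32)).astype(float)
test = reference.copy()
test[16:] = rng.integers(0, 256, (16, 32))
uniform = np.ones_like(reference)          # equal weights, i.e. plain MI
focus = uniform.copy()
focus[16:] = 0.0                           # focus only on the unchanged half
fmi_focussed = focussed_mutual_information(reference, test, focus, bins=8)
mi_uniform = focussed_mutual_information(reference, test, uniform, bins=8)
```

With a uniform focus the function reduces to a plain histogram estimate of MI, so the two values can be compared directly; in this synthetic case the changed half lowers the uniform estimate while the focussed estimate ignores it.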

Figure 10. Segmentation using a threshold (left) and patch covering the implants 
(right). 

Figure 11. Focus distribution f (left) and subtraction image (right). 

8. Discussion 

In this paper we have explored Mutual Information as a registration criterion from its 
information theoretical origin. The parallelism put forward by Collignon [3] between 
image registration and the model of a communication channel remains unsatisfactory. The 
validity of MI cannot be explained from information theory. Hughes and Daubechies [1] 
identify fundamental properties of MI in the framework of multi-modal image registration, 
to introduce simpler alternative similarity measures (distance metric between equivalence 
classes of images). Traditional MI neglects spatial information, it is dependent on overlap, 
and does not allow for the introduction of prior knowledge. Image registration based on 
(Normalized) Focussed Mutual Information with respect to a density function is a means to 
introduce this prior knowledge. In Jacquet et al. [7] Gauss Focussed Mutual Information is 
used to eliminate the dependence on overlap. In Jacquet et al. [H] Gauss Focussed Mutual 
Information is applied to the follow-up of small dental lesions. The centers of the Gaussian 
distributions are placed manually by the practitioner on the tooth under study by means 
of a digital tool. In the present paper, several methodologies tailored to specific 
registration applications are proposed. In Subsection 7.1 a dental restoration is detected 
through segmentation, and transformed into a regional focus distribution through convolution 
with a Gaussian kernel. In Subsection 7.2 the purpose of the approach is to create elongated 
foci along line structures used as references for the registration. These line structures are 
indicated with a series of points, manually placed by the practitioner, using a digital tool. 
B-spline curves are generated using these points as control points. The B-splines modeling 
the skull are transformed into the above mentioned focus distribution by convolution with 
a Gaussian kernel. In Subsection 7.3 two methodologies are presented. In both cases, the 
interaction between implants and the surrounding bone tissue is studied. The fact that the 
implants are simply connected objects in the scene with a maximal radio-opacity 
constitutes the prior knowledge. Both applications are handled in a fully automated procedure in 
which the focus is derived from the image representing the modulus of the gradient. In the 
first case the object of the study is the movement of the implant due to aseptic loosening, 
which requires focussing on the bone, and therefore, removing the implant from the focus. 
In the second case the object of the study is the evolution of the bone tissue surrounding 
an implant and therefore, focus is put on the implant. 

Further study will combine the efforts to incorporate spatial information with the 
introduction of prior knowledge through the sample distribution and extend it to registration 
of 3D images, such as those of hip, knee, and shoulder implants. We will elaborate the automatic 
creation of models to be used as basis for the sampling distribution. Alternative functional 
forms will be studied in order to increase robustness. Determination of the optimal number 
of gray value bins is traditionally an aspect of optimal estimation. We will explore optimal 
recombination of gray values into gray value bins from a pure registration point of view. 
Furthermore, there is no fundamental objection to the use of elastic transformations in 
combination with FMI registration. The segmentation technique used in the examples is 
extremely crude (threshold). When migrating to e.g. elastic transformations of images of 
soft tissue we will incorporate more subtle image segmentation methods. 